Jointly Parse and Fragment Ungrammatical Sentences

Authors

  • Homa B. Hashemi
  • Rebecca Hwa
Abstract

This paper is about detecting incorrect arcs in a dependency parse for sentences that contain grammar mistakes. Pruning these arcs results in well-formed parse fragments that can still be useful for downstream applications. We propose two automatic methods that jointly parse the ungrammatical sentence and prune the incorrect arcs: a parser retrained on a parallel corpus of ungrammatical sentences with their corrections, and a sequence-to-sequence method. Experimental results show that the proposed strategies are promising for detecting incorrect syntactic dependencies as well as incorrect semantic dependencies.
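
The effect of pruning can be illustrated with a small sketch. It does not reproduce either of the paper's two methods, which decide which arcs are incorrect; it only shows, under assumptions, how a set of already-flagged arcs splits a dependency tree into fragments. The function name `fragment_parse`, the head-array encoding, and the example sentence are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def fragment_parse(heads, pruned):
    """Group tokens into fragments after removing flagged arcs.

    heads[i] is the 1-based head of token i+1 (0 means the artificial root).
    pruned is a set of 1-based dependent indices whose incoming arc is dropped.
    Assumes heads encodes a well-formed tree (no cycles).
    """
    def fragment_root(token):
        # Climb the kept arcs until reaching a token with no surviving head arc.
        while heads[token - 1] != 0 and token not in pruned:
            token = heads[token - 1]
        return token

    groups = defaultdict(list)
    for token in range(1, len(heads) + 1):
        groups[fragment_root(token)].append(token)
    return sorted(groups.values())


# Hypothetical parse of "I want go to school":
# I -> want, want -> ROOT, go -> want, to -> school, school -> go
heads = [2, 0, 2, 5, 3]
# Suppose the arc into token 3 ("go") has been flagged as incorrect.
fragments = fragment_parse(heads, pruned={3})
```

With the arc into "go" removed, the tokens fall into two groups, one headed by "want" and one headed by "go", each of which is still a well-formed subtree.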

Similar Resources

Parse Tree Fragmentation of Ungrammatical Sentences

Ungrammatical sentences present challenges for statistical parsers because the well-formed trees they produce may not be appropriate for these sentences. We introduce a framework for reviewing the parses of ungrammatical sentences and extracting the coherent parts whose syntactic analyses make sense. We call this task parse tree fragmentation. In this paper, we propose a training methodology fo...

An Evaluation of Parser Robustness for Ungrammatical Sentences

For many NLP applications that require a parser, the sentences of interest may not be well-formed. If the parser can overlook problems such as grammar mistakes and produce a parse tree that closely resembles the correct analysis for the intended sentence, we say that the parser is robust. This paper compares the performances of eight state-of-the-art dependency parsers on two domains of ungramm...

Parsing Ungrammatical Input: an Evaluation Procedure

This paper presents a procedure for evaluating a parser’s ability to produce an accurate parse for an ungrammatical sentence. It is based on the existence of a corpus of ungrammatical sentences, and a parallel corpus containing corrected, and hence grammatical, versions of the sentences in the first corpus. This procedure is applied to a wide-coverage probabilistic parser (Charniak, 2000), and ...

The effect of correcting grammatical errors on parse probabilities

We parse the sentences in three parallel error corpora using a generative, probabilistic parser and compare the parse probabilities of the most likely analyses for each grammatical sentence and its closely related ungrammatical counterpart.

Parser Features for Sentence Grammaticality Classification

Automatically judging sentences for their grammaticality is potentially useful for several purposes — evaluating language technology systems, assessing language competence of second or foreign language learners, and so on. Previous work has examined parser ‘byproducts’, in particular parse probabilities, to distinguish grammatical sentences from ungrammatical ones. The aim of the present paper ...

Publication date: 2017